(b) partial sequences[,] which are at least 14 base pairs in length of [the] a
sequence[s] defined under (a), 

(c) sequences which hybridize with any of the sequences defined under 
(a) in 2 x SSC at 60°C[, preferably in 0.5 x SSC at 60°C, particularly 
preferably in 0.2 x SSC at 60°C], 

(d) sequences which exhibit at least 70% identity with any of the 
sequences defined under (a)[,] between position 1295 and position
2195 from SEQ ID NO: 1, or between position 432 and position 1318
from SEQ ID NO: 3, or between position 154 and position 1123 from
SEQ ID NO: 5,

(e) sequences which are complementary to the sequences defined under 
(a), and 

(f) sequences which, [on account of the] due to degeneracy of [the] 
genetic code, encode the same amino acid sequences as the 
sequences defined under (a), (b), (c) and [to] (d).

2. (Amended) A [V]vector which comprises at least one nucleic acid
[according to] of Claim 1.

3. (Amended) The [V]vector [according to] of Claim 2, characterized in that 
the nucleic acid is functionally linked to regulatory sequences which ensure [the] 
expression of the nucleic acid in prokaryotic or eukaryotic cells. 

4. (Amended) A [H]host cell which contains a nucleic acid [according to] of 
Claim 1 [or a vector according to Claim 2 or 3]. 

5. (Amended) A [H]host cell [according to] of Claim 4, characterized in that it 
is a prokaryotic or a eukaryotic cell. 

6. (Amended) A [H]host cell [according to] of Claim 5, characterized in that 
the prokaryotic cell is E.coli. 

7. (Amended) A [H]host cell [according to] of Claim 5, characterized in that 
the eukaryotic cell is a mammalian cell or an insect cell. 

8. (Amended) A [P]polypeptide which is encoded by a nucleic acid
[according to] of Claim 1.

9. (Amended) An [A]acetylcholine receptor which comprises at least one
polypeptide [according to] of Claim 8.

10. (Amended) A [P]process for preparing a polypeptide [according to
Claim 8, which comprises] encoded by a nucleic acid of Claim 1 comprising 

(a) culturing a host cell [according to one of Claims 4 to 7] containing a 
nucleic acid of Claim 1 or a vector comprising at least one nucleic acid 
of Claim 1 under conditions which ensure [the] expression of the 
nucleic acid [according to] of Claim 1, and

(b) isolating the polypeptide from the cell or the culture medium. 

11. (Amended) An [A]antibody which reacts specifically with the polypeptide
[according to] of Claim 8 [or the receptor according to Claim 9].

12. (Amended) A [T]transgenic invertebrate which contains a nucleic acid
[according to] of Claim 1.

13. (Amended) The [T]transgenic invertebrate [according to] of Claim 12,
characterized in that it is Drosophila melanogaster or Caenorhabditis elegans.

14. (Amended) A [P]process for producing a transgenic invertebrate 
[according to Claim 12 or 13,] which comprises introducing a nucleic acid [according 
to] of Claim 1 or a vector comprising at least one nucleic acid of Claim 1 [according 
to Claim 2 or Claim 3]. 

15. (Amended) The [T]transgenic progeny of an invertebrate [according to] of
Claim 12 [or 13].

16. (Amended) A [P]process for preparing a nucleic acid [according to] of 
Claim 1[, which comprises the following steps:] comprising 

(a) carrying out an entirely chemical synthesis [in a manner known per se,] 
or 

(b) chemically synthesizing an oligonucleotide[s], labelling the
oligonucleotide[s], hybridizing the oligonucleotide[s] to the DNA of an
insect cDNA library, selecting a positive clone[s] and isolating the
hybridizing DNA from a positive clone[s], or

(c) chemically synthesizing an oligonucleotide[s] and amplifying the target
DNA by means of PCR.

17. (Amended) The [R]regulatory region which naturally controls
transcription of a nucleic acid [according to] of Claim 1 in insect cells and ensures
specific expression.

18. (Amended) A [P]process for discovering novel active compounds for
plant protection, in particular compounds which alter the conducting properties of an
acetylcholine receptor[s according to Claim 9] made up of at least one polypeptide
encoded by a nucleic acid of Claim 1, which comprises the following steps:

(a) providing a host cell [according to one of Claims 4 to 7] containing a
nucleic acid of Claim 1 or a vector comprising at least one nucleic acid
of Claim 1,

(b) culturing the host cell in the presence of at least one [a] compound [or 
a sample which comprises a multiplicity of compounds], and 

(c) detecting altered receptor properties. 

19. (Amended) A [P]process for discovering a compound which binds to an 
acetylcholine receptor[s according to Claim 9, which encompasses the following 
steps:] comprising 

(a) bringing a host cell [according to one of Claims 4 to 7] containing a 
nucleic acid of Claim 1 or a vector comprising at least one nucleic acid 
of Claim 1, a polypeptide [according to Claim 8] encoded by a nucleic
acid of Claim 1 or [a] an acetylcholine receptor [according to Claim 9]
comprising at least one polypeptide encoded by a nucleic acid of Claim
1 into contact with [a] at least one compound [or a mixture of
compounds] under conditions which permit interaction of the
compound [compound(s)] with the host cell, the polypeptide or the
receptor, and

(b) determining the compound [compound(s)] which bind [bind(s)]
specifically to the receptor[s].

20. (Amended) A [P]process for discovering compounds which alter the 
expression of an acetylcholine receptor comprising at least one polypeptide encoded 
by a nucleic acid of Claim 1 [receptors according to Claim 9,] which comprises the 
following steps: 

(a) bringing a host cell [according to one of Claims 4 to 7] containing a
nucleic acid of Claim 1 or a vector comprising at least one nucleic acid
of Claim 1 or a transgenic invertebrate containing a nucleic acid of
Claim 1 [according to Claim 11 or Claim 12] into contact with [a] at
least one compound [or a mixture of compounds], 

(b) determining the receptor concentration, and 

(c) determining the [compound(s)] compound which specifically 
[influence(s)] influences the expression of the receptor. 

Please add the following new Claims 22-34: 

--22. The nucleic acid of Claim 1 which comprises a sequence that hybridizes
with a sequence defined under (a) in 0.5 x SSC at 60°C. 

23. The nucleic acid of Claim 1 which comprises the sequence that 
hybridizes with a sequence defined in (a) in 0.2 x SSC at 60°C.

24. A host cell containing the vector of Claim 2. 

25. A host cell containing the vector of Claim 3. 

26. The host cell of Claim 24 that is a prokaryotic or a eukaryotic cell. 

27. The host cell of Claim 25 that is a prokaryotic or a eukaryotic cell.

28. The host cell of Claim 26 that is an E. coli cell. 

29. The host cell of Claim 27 that is an E. coli cell. 

30. The host cell of Claim 26 that is a mammalian or an insect cell. 

31. The host cell of Claim 27 that is a mammalian or an insect cell.

32. An antibody which reacts specifically with the acetylcholine receptor of 
Claim 9. 

33. A transgenic progeny of the invertebrate of Claim 13.--

REMARKS 

The specification has been amended at page 4 to insert a Brief Description of 
the Drawing. This description corresponds to that given at page 8, lines 19-20 of the 
specification. 

The specification has also been amended at page 16 to change the heading 
from "References" to "Prior Art" to more accurately reflect the status of these 
disclosures. 

Claim 21 has been cancelled. 

Claims 1-20 have been rewritten to place them in better grammatical form
and to remove the multiple dependencies which occurred therein.

New Claims 22 and 23 are directed to subject matter deleted from original
Claim 1. 

New Claims 24 and 25 are directed to subject matter deleted from original 
Claim 4. 

New Claims 26 and 27 are directed to subject matter deleted from original 
Claim 5. 

New Claims 28 and 29 are directed to subject matter deleted from original 
Claim 6. 

New Claims 30 and 31 are directed to subject matter deleted from original 
Claim 7. 

New Claim 32 is directed to subject matter deleted from original Claim 11.

New Claim 33 is directed to subject matter deleted from original Claim 14.

An Action on the merits of this case is respectfully requested.

Respectfully submitted, 



MARTIN ADAMCZEWSKI
NADJA OELLERS
THOMAS SCHULTE

Nucleic acids which encode insect acetylcholine receptor subunits

The invention relates, in particular, to nucleic acids which encode insect
acetylcholine receptor subunits.

Nicotinic acetylcholine receptors are ligand-regulated ion channels which are of
importance in neurotransmission in the animal kingdom. The binding of acetylcholine
or other agonists to the receptor induces a transient opening of the channel and
allows cations to flow through. It is assumed that a receptor consists of five subunits
which are grouped around a pore. Each of these subunits is a protein which consists
of an extracellular N-terminal moiety followed by three transmembrane regions, an
intracellular moiety, a fourth transmembrane region and a short extracellular
C-terminal moiety (Changeux et al. 1992).

Acetylcholine receptors are especially well investigated in vertebrates. In this
context, three groups can be distinguished on the basis of their anatomical location and
their functional properties (conducting properties of the channel, desensitization, and
sensitivity towards agonists and antagonists and also towards toxins such as
α-bungarotoxin). The classification correlates with the molecular composition of the
receptors. There are heterooligomeric receptors having the subunit composition
α2βγδ, which are found in muscle (Noda et al. 1982, Claudio et al. 1983,
Devillers-Thiery et al. 1983, Noda et al. 1983a, b), heterooligomeric receptors which
contain subunits from the α2 - α6 and β2 - β4 groups and which are found in the nervous
system (Wada et al. 1988, Schoepfer et al. 1990, Cockcroft et al. 1991, Heinemann et
al. 1997), and also homooligomeric receptors which contain subunits from the α7 - α9
group and which are likewise found in the nervous system (Lindstrom et al. 1997,
Elgoyhen et al. 1997). This classification is also supported by an examination of the
relatedness of the gene sequences of the different subunits. Typically, the sequences
of functionally homologous subunits from different species are more similar to each
other than are sequences of subunits which are from different groups but from the
same species. Thus, the rat muscle α subunit, for example, exhibits 78% amino acid
identity and 84% amino acid similarity with that of the electric ray Torpedo
californica but only 48% identity and 59% similarity with the rat α2 subunit
(heterooligomeric, neuronal) and 36% identity and 45% similarity with the rat α7
subunit (homooligomeric, neuronal). Furthermore, the gene sequences of all the known
acetylcholine receptor subunits are to a certain extent similar not only to each other
but also to those of some other ligand-regulated ion channels (e.g. the serotonin
receptors of the 5HT3 type, the GABA-regulated chloride channels and the
glycine-regulated chloride channels). It is therefore assumed that all these receptors
are descended from one common precursor and they are classified into one supergene
family (Ortells et al. 1995).

In insects, acetylcholine is the most important excitatory neurotransmitter of the
central nervous system. Accordingly, acetylcholine receptors can be detected
electrophysiologically in preparations of insect central nervous system ganglia. The
receptors are detected both in postsynaptic and presynaptic nerve endings and in the
cell bodies of interneurones, motor neurones and modulatory neurones (Breer et al. 1987,
Buckingham et al. 1997). Some of the receptors are inhibited by α-bungarotoxin
while others are insensitive (Schloß et al. 1988). In addition, the acetylcholine
receptors are the molecular point of attack for important natural (e.g. nicotine) and
synthetic insecticides (e.g. chloronicotinyls).

The gene sequences of a number of insect nicotinic acetylcholine receptors are
already known. Thus, the sequences of five different subunits have been described in
Drosophila melanogaster (Bossy et al. 1988, Hermanns-Borgmeyer et al. 1986,
Sawruk et al. 1990a, 1990b, Schulz et al. unpublished, EMBL accession number
Y15593), while five have likewise been described in Locusta migratoria (Stetzer et
al. unpublished, EMBL accession numbers AJ000390 - AJ000393), one has been
described in Schistocerca gregaria (Marshall et al. 1990), two have been described in
Myzus persicae (Sgard et al. unpublished, EMBL accession numbers X81887 and
X81888), and one has been described in Manduca sexta (Eastham et al. 1997).
Furthermore, a number of partial gene sequences from Drosophila melanogaster have
been characterized as so-called expressed sequence tags (Genbank accession numbers
AA540687, AA698155, AA697710, AA697326). The fact that individual sequences
are very similar to those from other insects suggests that these subunits are functional
homologues.

It is of great practical importance to make available new insect acetylcholine receptor 
subunits, for example for the purpose of searching for novel insecticides, with those 
subunits which differ from the known subunits to a greater extent than is the case 
between functional homologues being particularly of interest. 

The present invention is consequently based, in particular, on the object of making 
available nucleic acids which encode novel insect acetylcholine receptor subunits. 

This object is achieved by the provision of nucleic acids which comprise a sequence 
selected from 

(a) the sequences according to SEQ ID NO: 1, SEQ ID NO: 3 or SEQ ID NO: 5, 

(b) part sequences of the sequences defined in (a) which are at least 14 base pairs in
length, 

(c) sequences which hybridize to the sequences defined in (a) in 2 x SSC at 60°C, 
preferably in 0.5 x SSC at 60°C, particularly preferably in 0.2 x SSC at 60°C 
(Sambrook et al. 1989), 

(d) sequences which exhibit at least 70% identity with the sequences defined in 
(a), between position 1295 and position 2195 in the case of SEQ ID NO: 1, or 
between position 432 and position 1318 in the case of SEQ ID NO: 3, or between
position 154 and position 1123 in the case of SEQ ID NO: 5,

(e) sequences which are complementary to the sequences defined in (a), and

(f) sequences which, because of the degeneracy of the genetic code, encode the 
same amino acid sequences as the sequences defined in (a) to (d). 
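
The stringency conditions in (c) are expressed as multiples of SSC. As a rough illustration only, the sketch below converts the three named dilutions into salt concentrations. It assumes the common laboratory recipe in which 1 x SSC is 0.15 M sodium chloride and 0.015 M trisodium citrate; that recipe is not stated in this text, and the function name is hypothetical.

```python
# Assumed composition of 1 x SSC (common laboratory recipe, not given in this text).
NACL_M_PER_X = 0.15       # mol/l sodium chloride in 1 x SSC
CITRATE_M_PER_X = 0.015   # mol/l trisodium citrate in 1 x SSC


def ssc_molarity(fold):
    """Return (NaCl, trisodium citrate) molarity for a given SSC dilution."""
    return (round(fold * NACL_M_PER_X, 4), round(fold * CITRATE_M_PER_X, 4))


# The three conditions named in (c); a lower salt concentration at the
# same temperature is the more stringent condition.
for fold in (2.0, 0.5, 0.2):
    nacl, citrate = ssc_molarity(fold)
    print(f"{fold} x SSC: {nacl} M NaCl, {citrate} M trisodium citrate")
```

Under this assumption, the preferred conditions (0.5 x and 0.2 x SSC) correspond to progressively lower salt concentrations than 2 x SSC at the same 60°C.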

The degree of identity of the nucleic acid sequences is preferably determined using the 
GAP program from the GCG program package, Version 9.1 with standard settings 
(Devereux et al. 1984).
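
To make the 70% identity criterion in (d) concrete, the following is a minimal sketch of a percent-identity calculation over two sequences that have already been aligned. It is not the GAP program and does not reproduce its alignment or scoring; the convention used here (identical positions divided by aligned positions, ignoring columns that contain a gap character) and all names are assumptions made for illustration.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over two pre-aligned sequences of equal length.

    Columns in which either sequence has a gap ('-') are skipped.
    This is an illustrative convention, not the GAP algorithm.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a, aligned_b, threshold=70.0):
    """True if the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical toy sequences: 8 of 10 aligned positions are identical.
print(percent_identity("ATGCATGCAT", "ATGCTTGCAA"))
```

In practice the comparison would be restricted to the position ranges named in (d) for the respective SEQ ID NO before the identity is evaluated.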

The present invention is based on the surprising finding that insects possess genes
which encode subunits of, in particular, homooligomeric acetylcholine receptors.

The invention furthermore relates to vectors which contain at least one of the novel
nucleic acids. All the plasmids, phasmids, cosmids, YACs or artificial chromosomes
which are used in molecular biological laboratories can be used as vectors. These
vectors can be linked to the usual regulatory sequences for the purpose of expressing
the novel nucleic acids. The choice of such regulatory sequences depends on whether
prokaryotic or eukaryotic cells, or cell-free systems, are used for the expression. The
SV40, adenovirus or cytomegalovirus early or late promoter, the lac system, the trp
system, the main operator and promoter regions of phage lambda, the control regions
of the fd coat protein, the 3-phosphoglycerate kinase promoter, the acid phosphatase
promoter and the yeast α-mating factor promoter are examples of expression control
sequences which are particularly preferred.

In order to be expressed, the nucleic acids according to the invention can be introduced
into suitable host cells. Both prokaryotic cells, preferably E. coli, and eukaryotic cells,
preferably mammalian or insect cells, are suitable for use as host cells. Other examples
of suitable unicellular host cells are: Pseudomonas, Bacillus, Streptomyces, yeasts,
HEK-293, Schneider S2, CHO, COS1 and COS7 cells, plant cells in cell culture and
also amphibian cells, in particular oocytes.

The present invention also relates to polypeptides which are encoded by the nucleic
acids according to the invention and also the acetylcholine receptors, preferably
homooligomeric acetylcholine receptors, which are synthesized from them.

In order to prepare the polypeptides which are encoded by the nucleic acids according
to the invention, host cells which contain at least one of the nucleic acids according to
the invention can be cultured under suitable conditions. After that, the desired
polypeptides can be isolated from the cells or the culture medium in a customary manner.

The invention furthermore relates to antibodies which bind specifically to the
above-mentioned polypeptides or receptors. These antibodies are prepared in the customary
manner. For example, such antibodies can be produced by injecting a substantially
immunocompetent host with a quantity of an acetylcholine receptor polypeptide, or a
fragment thereof, according to the invention which is effective for producing
antibodies, and subsequently isolating these antibodies. Furthermore, an immortalized cell
line which produces monoclonal antibodies can be obtained in a manner known per se.
Where appropriate, the antibodies can be labelled with a detection reagent. Preferred
examples of such a detection reagent are enzymes, radioactively labelled elements,
fluorescent chemicals or biotin. Instead of the complete antibody, use can also be made
of fragments which possess the desired specific binding properties.

The nucleic acids according to the invention can be used, in particular, for producing 
transgenic invertebrates. These latter can be employed in test systems which are based 
on an expression of the receptors according to the invention, or variants thereof, which 
differs from that of the wild type. In addition, this includes all transgenic invertebrates
in which a change in the expression of the receptors according to the invention, or their
variants, occurs as the result of modifying other genes or gene control sequences
(promoters).

The transgenic invertebrates are produced, for example, in Drosophila melanogaster by
means of P element-mediated gene transfer (Hay et al., 1997) or in Caenorhabditis
elegans by means of transposon-mediated gene transfer (e.g. using Tc1, Plasterk, 1996).

The invention also consequently relates to transgenic invertebrates which contain at
least one of the nucleic acid sequences according to the invention, preferably to
transgenic invertebrates of the species Drosophila melanogaster or Caenorhabditis elegans,
and to their transgenic progeny. Preferably, the transgenic invertebrates contain the
receptors according to the invention in a form which differs from that of the wild type.

The nucleic acids according to the invention can be prepared in the customary manner.
For example, the nucleic acid molecules can be synthesized entirely chemically. In
addition, only short segments of the sequences according to the invention can be
synthesized chemically and these oligonucleotides can be labelled radioactively or with a
fluorescent dye. The labelled oligonucleotides can be used to screen cDNA libraries
prepared from insect mRNA. Clones which hybridize to the labelled oligonucleotides
("positive clones") are selected for isolating the relevant DNA. After the isolated DNA
has been characterized, the nucleic acids according to the invention are readily
obtained.

The nucleic acids according to the invention can also be prepared by means of PCR 
methods using chemically synthesized oligonucleotides. 

The nucleic acids according to the invention can be used for isolating and
characterizing the regulatory regions which occur naturally adjacent to the coding region.
Consequently, the present invention also relates to these regulatory regions.

The nucleic acids according to the invention can be used to identify novel active
compounds for plant protection, such as compounds which, as modulators, in particular as
agonists or antagonists, alter the conducting properties of the acetylcholine receptors
according to the invention. For this, a recombinant DNA molecule, which encompasses
at least one nucleic acid according to the invention, is introduced into a suitable host
cell. The host cell is cultured, in the presence of a compound or a sample which
comprises a multiplicity of compounds, under conditions which permit expression of the
receptors according to the invention. A change in the receptor properties can be
detected, as described below in Example 2. Using this approach, it is possible to discover
insecticidal substances.

Le A 33 020-Foreign Countries

The nucleic acids according to the invention also make it possible to discover
compounds which bind to the receptors according to the invention. These compounds can
likewise be used as insecticides on plants. For example, host cells which contain the
nucleic acid sequences according to the invention and express the corresponding
receptors or polypeptides, or the gene products themselves, are brought into contact with
a compound or a mixture of compounds under conditions which permit the interaction
of at least one compound with the host cells, receptors or the individual polypeptides.

Host cells or transgenic invertebrates which contain the nucleic acids according to the
invention can also be used to discover substances which alter the expression of the
receptors.

The above-described nucleic acids, vectors and regulatory regions according to the
invention can additionally be used for discovering genes which encode polypeptides
which are involved in the synthesis, in insects, of functionally similar acetylcholine
receptors. According to the present invention, functionally similar receptors are
understood as being receptors which encompass polypeptides which, while differing in their
amino acid sequences from the polypeptides described in this present publication,
essentially possess the same functions.

Comments on the sequence listing and the figures:

SEQ ID NO: 1 shows the nucleotide sequence of the isolated Da7 cDNA, beginning
with position 1 and ending with position 2886. SEQ ID NO: 1 and SEQ ID NO: 2
also show the amino acid sequences of the protein deduced from the Da7 cDNA
sequence.

SEQ ID NO: 3 shows the nucleotide sequence of the isolated Hva7-1 cDNA,
beginning with position 1 and ending with position 3700. SEQ ID NO: 3 and SEQ ID NO:
4 also show the amino acid sequences of the protein deduced from the Hva7-1 cDNA
sequence.

SEQ ID NO: 5 shows the nucleotide sequence of the isolated Hva7-2 cDNA,
beginning with position 1 and ending with position 3109. SEQ ID NO: 5 and SEQ ID NO:
6 also show the amino acid sequences of the protein deduced from the Hva7-2 cDNA
sequence.

Figure 1 shows the increase in intracellular calcium which occurs in cells which have
been recombinantly modified as described in Example 2 following the addition of
nicotine. Cells were loaded with Fura-2-acetoxymethyl ester (5-10 µM in serum-free
minimal essential medium containing 1% bovine serum albumin and 5 mM calcium
chloride), washed with Tyrode solution buffered with
N-(2-hydroxyethyl)piperazine-N'-(2-ethanesulphonic acid) (5 mM HEPES) and alternately
illuminated, under a fluorescence microscope (Nikon Diaphot), with light of 340 nm
and 380 nm wavelength. A measurement point corresponds to a pair of video images
at the two wavelengths (exposure time per image, 100 ms). The time interval between
two measurement points is 3 s. After 8 images had been taken (measurement
point 4.0), nicotine was added to a final concentration of 500 µM and the measurement
series was continued. The fluorescence intensity of the cells when illuminated
with light of 380 nm wavelength was divided by the corresponding intensity at 340
nm, thereby giving the ratio.
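
The ratio calculation described for Figure 1 can be sketched as follows. This is only an illustration of the arithmetic stated in the text (intensity at 380 nm divided by intensity at 340 nm, one value per measurement point, 3 s between points); the function name and the intensity values are hypothetical, and time is counted from the first measurement point as an assumption.

```python
INTERVAL_S = 3.0  # time between two measurement points, as stated in the text


def ratio_trace(f340, f380):
    """Return a list of (time_s, ratio) tuples, one per measurement point.

    Each measurement point is a pair of images at 340 nm and 380 nm;
    the ratio is the 380 nm intensity divided by the 340 nm intensity.
    """
    if len(f340) != len(f380):
        raise ValueError("need one 340 nm and one 380 nm value per point")
    return [
        (index * INTERVAL_S, i380 / i340)
        for index, (i340, i380) in enumerate(zip(f340, f380))
    ]


# Hypothetical intensities for six measurement points (arbitrary units).
f340 = [100.0, 100.0, 100.0, 100.0, 125.0, 160.0]
f380 = [50.0, 50.0, 50.0, 50.0, 50.0, 40.0]
for time_s, ratio in ratio_trace(f340, f380):
    print(f"{time_s:5.1f} s  ratio = {ratio:.3f}")
```

Four measurement points correspond to the 8 images taken before nicotine is added, so any change in the ratio would be expected from the fifth point onwards.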

Examples: 

Example 1 

Isolating the described polynucleotide sequences 

Polynucleotides were manipulated using standard methods of recombinant DNA
technology (Sambrook et al., 1989). The bioinformatic processing of nucleotide and
protein sequences was carried out using the GCG program package Version 9.1
(GCG Genetics Computer Group, Inc., Madison, Wisconsin, USA).

Partial polynucleotide sequences

Sequence comparisons ("ClustalW") of protein sequences from genes whose ability to
form homooligomeric acetylcholine receptors was known were used to identify regions
from which degenerate oligonucleotides were deduced by backtranslating the codons.
In all, 5 such oligonucleotide pairs were selected for the polymerase
chain reaction (PCR). Only one combination (see below) gave a product both from
Heliothis cDNA and from Drosophila cDNA.

RNA was isolated from whole Heliothis virescens embryos (shortly before hatching)
using Trizol reagent (Gibco BRL, in accordance with the manufacturer's instructions).
The same procedure was adopted with Drosophila embryos (24 h at 25°C). 10 µg of
these RNAs were employed in a first cDNA strand synthesis (Superscript
Preamplification System for first cDNA strand synthesis, Gibco BRL, in accordance
with the manufacturer's instructions, reaction temperature 45°C).

Subsequently, 1/100 of the abovementioned first-strand cDNA was in each case
employed in a polymerase chain reaction (PCR) using the oligonucleotides alpha7-1s
(5'-GAYGTIGAYGARAARAAYCA-3') and alpha7-2a
(5'-CYYTCRTCIGCRCTRTTRTA-3') (recombinant Taq DNA polymerase, Gibco
BRL). The PCR parameters were as follows: Hva7-1 and Hva7-2: 94°C, 2 min; 35
times (94°C, 45 s; 50°C, 30 s; 72°C, 60 s) and also Da7: 96°C, 2 min; 35 times
(96°C, 45 s; 50°C, 30 s; 72°C, 60 s). In each case, this resulted in a detectable band of
approx. 0.2 kb in an agarose gel (1%), both in the case of Drosophila cDNA and in
the case of Heliothis cDNA. After the DNA fragments had been subcloned by means
of SrfScript (Stratagene), and their sequences had been determined, it turned out that
two different DNA fragments had been amplified from Heliothis cDNA; these were
228-11 = Hva7-1 (partial, containing 165 bp) and 228-8 = Hva7-2 (partial, containing
171 bp). Only one DNA fragment was isolated from Drosophila cDNA; this was
248-5 = Da7 (partial, containing 150 bp).
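
As an aside on the degenerate oligonucleotides, the number of distinct sequences represented by each primer can be counted from its ambiguity codes. The sketch below assumes the usual IUPAC meanings (R = A or G, Y = C or T) and treats inosine (I) as a single fixed residue rather than a mixture; the primer strings are as read from the text above, the second being pieced together across a line break, and the names in the code are hypothetical.

```python
# Number of alternative bases per symbol. R and Y follow the IUPAC
# ambiguity codes; I (inosine) is counted as one residue (assumption).
BASE_OPTIONS = {"A": 1, "C": 1, "G": 1, "T": 1, "R": 2, "Y": 2, "I": 1}


def degeneracy(primer):
    """Return how many distinct oligonucleotides a degenerate primer covers."""
    total = 1
    for symbol in primer.upper():
        total *= BASE_OPTIONS[symbol]  # KeyError for unsupported symbols
    return total


PRIMERS = {
    "alpha7-1s": "GAYGTIGAYGARAARAAYCA",
    "alpha7-2a": "CYYTCRTCIGCRCTRTTRTA",
}

for name, sequence in PRIMERS.items():
    print(name, len(sequence), "nt, degeneracy", degeneracy(sequence))
```

Both primers are 20 nucleotides long; under these assumptions the sense primer is 32-fold and the antisense primer 64-fold degenerate.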

Isolating poly A-containing RNA from Heliothis virescens tissue and constructing 
the cDNA libraries 

The RNA for cDNA library I was isolated from whole Heliothis virescens embryos
(shortly before hatching) using Trizol reagent (Gibco BRL, in accordance with the
manufacturer's instructions). The RNA for cDNA library II was isolated from whole
head ganglia from 500 Heliothis virescens larvae (stages 4-5) using Trizol reagent
(Gibco BRL, in accordance with the manufacturer's instructions). The poly A-containing
RNAs were then isolated from these RNAs by purifying with Dyna Beads
280 (Dynal). 5 µg of these poly A-containing RNAs were subsequently employed in
constructing cDNA libraries I and II using the λ-ZAPExpress vector (cDNA Synthesis
Kit, ZAP-cDNA Synthesis Kit and ZAP-cDNA Gigapack III Gold Cloning
Kit, all from Stratagene). In a departure from the manufacturer's instructions,
Superscript Reverse Transcriptase (Gibco BRL) was used for synthesizing the cDNA at a
synthesis temperature of 45°C. In addition, radioactively labelled deoxynucleoside
triphosphates were not added. Furthermore, the synthesized cDNAs were not
fractionated through the gel filtration medium contained in the kit but instead through
Size Sep 400 Spun Columns (Pharmacia).

Complete polynucleotide sequences 

Apart from the first screening round when isolating the Hva7-1 clone, all the screens
were carried out using the DIG system (all reagents and consumables from
Boehringer Mannheim, in accordance with the instructions in "The DIG System
User's Guide for Filter Hybridization", Boehringer Mannheim). The DNA probes
employed were prepared by means of PCR using digoxigenin-labelled dUTP. The
hybridizations were carried out at 42°C overnight in DIG Easy Hyb (Boehringer
Mannheim). Labelled DNA was detected on nylon membranes by means of
chemiluminescence (CDP-Star, Boehringer Mannheim) using X-ray films (Hyperfilm MP,
Amersham). Initial partial sequencing of the isolated gene library plasmids was
carried out, for identification purposes, using T3 and T7 primers (ABI Prism Dye
Terminator Cycle Sequencing Kit, ABI, using an ABI Prism 310 Genetic Analyzer). The
complete polynucleotide sequences in Hva7-1, Hva7-2 and Da7 were determined, as
a commissioned sequencing carried out by Qiagen, Hilden, by means of primer
walking using cycle sequencing.

a. Isolating the Da7 clone 

10^6 phages from a Drosophila melanogaster cDNA library in λ phages (Canton-S
embryo, 2-14 hours, in Uni-ZAP XR vector, Stratagene) were screened using
DIG-labelled 248-5 as the probe (in accordance with the manufacturer's (Stratagene)
instructions). The maximum stringency when washing the filters was: 0.2 x SSC; 0.1%
SDS; 42°C; 2 x 15 min. One clone (clone 432-1) was isolated whose insert had a size
of 2940 bp (Da7, SEQ ID NO: 1). The largest open reading frame of this sequence
begins at position 372 of the depicted sequence and ends at position 1822. The 770
amino acids polypeptide which is deduced from this (SEQ ID NO: 2) has a calculated
molecular weight of 87.01 kD.

b. Isolating the Hva7-1 clone

10^6 phages from the Heliothis virescens embryo cDNA library (library I) were
included in the screening. The first of three screening rounds took place using
α-32P-labelled 228-11 DNA as the probe. The probe was hybridized to the filters in
Quickhyb (Stratagene) at 68°C for one hour. The filters were then washed twice, for 15 min
on each occasion, at room temperature in 2 x SSC; 0.1% SDS and twice, for 30 min
on each occasion, at 42°C in 0.1 x SSC; 0.1% SDS. Hybridized probes were detected
by means of autoradiography, at -80°C overnight, using XR X-ray films (Kodak) and
employing intensifying screens (Amersham). The two further screening rounds were
carried out using the DIG System (Boehringer Mannheim).

The clone 241-5, which was isolated in this screen, contained an insert of 3630 bp. 
This insert (Hva7-1, SEQ ID NO: 3) possesses a longest open reading frame which
begins at position 335 of the depicted nucleic acid sequence and ends at position 
1821. The 496 amino acids polypeptide which is deduced from this (SEQ ID NO: 4) 
has a calculated molecular weight of 56.36 kD. 

c. Isolating the Hva7-2 clone 

10^6 phages from the Heliothis virescens ganglia cDNA library (library II) were 
included in the screening. DIG-labelled 228-8 DNA was used as the probe. The 
maximum stringency when washing the filters was: 0.1 x SSC; 0.1% SDS; 42°C; 
2 x 15 min. 

The clone 241-5, which was isolated in this screen, contained an insert of 3630 bp. 
This insert (Hva7-2, SEQ ID NO: 5) possesses a longest open reading frame which 
begins at position 95 of the depicted nucleic acid sequence and ends at position 
1597. The 501 amino acid polypeptide which is deduced from this (SEQ ID NO: 6) 
has a calculated molecular weight of 56.71 kD. 
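
As a quick consistency check on the open reading frames, the sketch below takes the CDS coordinates given in the sequence listing for SEQ ID NO: 1, 3 and 5 and derives the number of codons from them. The helper name is hypothetical, and the coordinates are assumed to be 1-based and inclusive and to exclude the stop codon.

```python
# CDS coordinates as given in the <222> fields of the sequence listing
# (assumed 1-based, inclusive, stop codon not included).
CDS = {
    "Da7 (SEQ ID NO: 1)": (372, 2681),
    "Hva7-1 (SEQ ID NO: 3)": (335, 1822),
    "Hva7-2 (SEQ ID NO: 5)": (95, 1597),
}


def codon_count(start, end):
    """Number of complete codons in an inclusive coordinate range."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"range {start}..{end} is not a whole number of codons")
    return length // 3


for name, (start, end) in CDS.items():
    print(name, codon_count(start, end), "codons")
```

The resulting counts can be compared directly with the polypeptide lengths of 770, 496 and 501 amino acids stated for the three clones.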

Example 2 

Generating the expression constructs 

a. Da7 

The sequence region from position 372 to position 2681 of SEQ ID NO: 1 was 
amplified by means of a polymerase chain reaction (PCR). Deoxyoligonucleotides 
having the sequences GCGAATTCACCACCATGAAAAATGCACAACTG and 
CGAGACAATAATATGTGGTGCCTCGAG were used for this. The Pfu polymerase from 
Stratagene was used as the DNA polymerase in accordance with the manufacturer's 
instructions. Following the amplification, the segment which had been generated 
was digested with the restriction endonucleases Eco RI and Xho I and cloned into 
a vector, i.e. pcDNA3.1/Zeo (Invitrogen), which had likewise been digested with 
Eco RI and Xho I. 

b. Hva7-1 

The sequence region from position 335 to position 1822 from SEQ ID NO: 3 was 
amplified by means of a polymerase chain reaction (PCR). Deoxyoligonucleotides 
having the sequences GCAAGCTTACCACCATGGGAGGTAGAGCTAGACGCTCGCAC and 
GCCTCGAGCGACACCATGATGTGTGGCGC were used for this. The Pfu polymerase from 
Stratagene was used as the DNA polymerase in accordance with the manufacturer's 
instructions. Following amplification, the generated segment was digested with 
the restriction endonucleases HindIII and Xho I and cloned into a vector, i.e. 
pcDNA3.1/Zeo (Invitrogen), which had likewise been digested with HindIII and 
Xho I. 
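
The cloning steps for Da7 and Hva7-1 rely on each primer carrying the recognition site of the enzyme used for the subsequent digest. The following sketch, which is only an illustration and not part of the described method, searches the four primers quoted above for the standard recognition sequences of Eco RI, HindIII and Xho I. The dictionary and function names are hypothetical.

```python
# Recognition sequences of the enzymes named in the text. All three sites are
# palindromic, so searching the primer as written also covers the other strand.
SITES = {"EcoRI": "GAATTC", "HindIII": "AAGCTT", "XhoI": "CTCGAG"}

# Primer sequences copied from the Da7 and Hva7-1 paragraphs.
PRIMERS = {
    "Da7 primer 1": "GCGAATTCACCACCATGAAAAATGCACAACTG",
    "Da7 primer 2": "CGAGACAATAATATGTGGTGCCTCGAG",
    "Hva7-1 primer 1": "GCAAGCTTACCACCATGGGAGGTAGAGCTAGACGCTCGCAC",
    "Hva7-1 primer 2": "GCCTCGAGCGACACCATGATGTGTGGCGC",
}


def sites_in(primer):
    """Return the names of the listed enzymes whose site occurs in the primer."""
    seq = primer.upper()
    return sorted(name for name, site in SITES.items() if site in seq)


for label, primer in PRIMERS.items():
    print(label, sites_in(primer))
```

Each primer is expected to contain exactly the site of the enzyme with which the amplified segment is later cut.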



c. Hva7-2 

The sequence region from position 95 to position 1597 from SEQ ID NO: 5 was 
amplified by means of a polymerase chain reaction (PCR). Deoxyoligonucleotides 
having the sequences GCAAGCGCCGCTATGGCCCCTATGTTG and 
TTGCACGATGATATGCGGTGCCTCGAGCG were used for this. The Pfu polymerase from 
Stratagene was used as the DNA polymerase in accordance with the manufacturer's 
instructions. Following amplification, the generated segment was digested with 
the restriction endonucleases HindIII and Xho I and cloned into a vector, i.e. 
pcDNA3.1/Zeo (Invitrogen), which had likewise been digested with HindIII and 
Xho I. 

d. Hva7-1/5HT3 and Hva7-2/5HT3 chimaeras 

The region from position 335 to position 1036 from SEQ ID NO: 3 (Hva7-1/5HT3 
chimaera) and the region from position 95 to position 763 from SEQ ID NO: 5 
(Hva7-2/5HT3 chimaera) were in each case fused to the region from position 778 to 
position 1521 from the Mus musculus 5-HT3 receptor cDNA (sequence in EMBL 
database: M774425) using the method of overlap extension (Jespersen et al. 1997). 
The two fragments were subsequently cloned into the pcDNA3.1/Zeo vector by 
means of TA cloning (Invitrogen, in accordance with the manufacturer's 
instructions). Constructs containing the correct orientation of the two fragments 
in the vector were identified by sequencing using the T7 primer (Invitrogen). 

Cell culture and gene transfer 

HEK293 cells, which express the α subunit of an L-type Ca channel (Zong et al. 
1995, Stetzer et al. 1996), were cultured in Dulbecco's modified Eagle's medium and 
10% foetal calf serum at 5% CO2 and from 20°C to 37°C. FuGENE 6 (Boehringer 
Mannheim GmbH, Mannheim, Germany) was used for the gene transfer in accordance 
with the manufacturer's instructions. At from 24 h to 48 h after the gene 
transfer, the cells were sown at various densities in microtitre plates. 
Recombinantly altered cells were selected by growth in Dulbecco's modified 
Eagle's medium and 10% foetal calf serum and 150 - 500 µg/ml of Zeocin over a 
period of from 3 to 4 weeks. Individual resistant clones were analyzed as 
described below. 

Fura-2 measurements 

The alterations in the intracellular calcium concentration were measured using 
Fura-2. A stock solution containing 2 mM Fura-2-acetoxymethyl ester (Sigma) in 
dimethyl sulphoxide (DMSO) was diluted to a final concentration of 5 - 10 µM in 
serum-free minimal essential medium (MEM, Gibco) containing 1% bovine serum 
albumin and 5 mM calcium chloride. The cells were incubated for from 45 to 60 min 
in this solution in a microtitre plate. The cells were then washed twice in Tyrode 
solution buffered with N-(2-hydroxyethyl)piperazine-N'-(2-ethanesulphonic acid) 
(5 mM HEPES) (HEPES-buffered salt solution containing 130 mM NaCl, 5 mM KCl, 
2 mM CaCl2, 1 mM MgCl2, 5 mM NaHCO3, 10 mM glucose). 100 µl of Tyrode buffer 
were added to the wells of the microtitre plate and the cells were illuminated 
alternately, under a fluorescence microscope (Nikon Diaphot), with light of 
340 nm and 380 nm wavelength. A series of video images (exposure time per image 
100 ms) was taken with pauses of 3 seconds and stored, as digitized images, in an 
image analysis computer (Leica, Quantimet 570). After 8 images had been taken 
(measurement point 4.0 in Fig. 1), nicotine was added to a final concentration of 
500 µM and the measurement series was continued. The fluorescence intensity of 
the cells when illuminating with light of 380 nm wavelength was divided by the 
corresponding intensity at 340 nm and in this way a ratio was formed which 
represents the relative increase in calcium concentration (Grynkiewicz et al. 
1985). 
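
The ratio formation described above can be sketched as follows. This is a minimal illustration, not the software used with the Quantimet system: the function names are hypothetical, the input lists stand for per-image mean intensities, and the normalisation to the images taken before nicotine addition is an added assumption that the text does not describe. The ratio follows the wording of the text (380 nm intensity over 340 nm intensity); swap the arguments if the opposite convention is wanted.

```python
def ratio_series(f380, f340):
    """Ratio per image, as worded in the text: intensity at 380 nm / intensity at 340 nm."""
    if len(f380) != len(f340):
        raise ValueError("both series must have the same number of images")
    return [a / b for a, b in zip(f380, f340)]


def normalise_to_baseline(ratios, n_baseline=8):
    """Divide every ratio by the mean of the first n_baseline ratios.

    n_baseline defaults to 8 because nicotine was added after 8 images;
    the normalisation itself is an assumption made for this sketch.
    """
    baseline = sum(ratios[:n_baseline]) / n_baseline
    return [r / baseline for r in ratios]
```

With two equally long lists of intensities, `normalise_to_baseline(ratio_series(f380, f340))` gives a series in which the pre-nicotine images average to 1.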

Patent Claims 

1. Nucleic acid which comprises a sequence selected from 

(a) the sequences according to SEQ ID NO: 1, SEQ ID NO: 3 or SEQ ID 
NO: 5, 

(b) partial sequences, which are at least 14 base pairs in length, of the 
sequences defined under (a), 

(c) sequences which hybridize with the sequences defined under (a) in 2 x 
SSC at 60°C, preferably in 0.5 x SSC at 60°C, particularly preferably 
in 0.2 x SSC at 60°C, 



(d) sequences which exhibit at least 70% identity with the sequences 
defined under (a), between position 1295 and position 2195 from SEQ 
ID NO: 1, or between position 432 and position 1318 from SEQ ID 
NO: 3, or between position 154 and position 1123 from SEQ ID NO: 5, 

(e) sequences which are complementary to the sequences defined under 
(a), and 

(f) sequences which, on account of the degeneracy of the genetic code, 
encode the same amino acid sequences as the sequences defined under 
(a) to (d). 

2. Vector which comprises at least one nucleic acid according to Claim 1. 

3. Vector according to Claim 2, characterized in that the nucleic acid is 
functionally linked to regulatory sequences which ensure the expression of the 
nucleic acid in prokaryotic or eukaryotic cells. 

4. Host cell which contains a nucleic acid according to Claim 1 or a vector 
according to Claim 2 or 3. 

5. Host cell according to Claim 4, characterized in that it is a prokaryotic or 
eukaryotic cell. 

6. Host cell according to Claim 5, characterized in that the prokaryotic cell is 
E.coli. 

7. Host cell according to Claim 5, characterized in that the eukaryotic cell is a 
mammalian cell or an insect cell. 

8. Polypeptide which is encoded by a nucleic acid according to Claim 1. 

9. Acetylcholine receptor which comprises at least one polypeptide according to 
Claim 8. 

10. Process for preparing a polypeptide according to Claim 8, which comprises 

(a) culturing a host cell according to one of Claims 4 to 7 under conditions 
which ensure the expression of the nucleic acid according to Claim 1, and 

(b) isolating the polypeptide from the cell or the culture medium. 

11. Antibody which reacts specifically with the polypeptide according to Claim 8 
or the receptor according to Claim 9. 

12. Transgenic invertebrate which contains a nucleic acid according to Claim 1. 

13. Transgenic invertebrate according to Claim 12, characterized in that it is 
Drosophila melanogaster or Caenorhabditis elegans. 

14. Process for producing a transgenic invertebrate according to Claim 12 or 13, 
which comprises introducing a nucleic acid according to Claim 1 or a vector 
according to Claim 2 or 3. 



15. Transgenic progeny of an invertebrate according to Claim 12 or 13. 

16. Process for preparing a nucleic acid according to Claim 1, which comprises 
the following steps: 

(a) carrying out an entirely chemical synthesis in a manner known per se, 
or 

(b) chemically synthesizing oligonucleotides, labelling the oligonucleotides, 
hybridizing the oligonucleotides to the DNA of an insect cDNA library, 
selecting positive clones and isolating the hybridizing DNA from positive 
clones, or 

(c) chemically synthesizing oligonucleotides and amplifying the target 
DNA by means of PCR. 



17. Regulatory region which naturally controls transcription of a nucleic acid 
according to Claim 1 in insect cells and ensures specific expression. 

18. Process for discovering novel active compounds for plant protection, in 
particular compounds which alter the conducting properties of receptors 
according to Claim 9, which comprises the following steps: 

(a) providing a host cell according to one of Claims 4 to 7, 

(b) culturing the host cell in the presence of a compound or a sample 
which comprises a multiplicity of compounds, and 

(c) detecting altered receptor properties. 

19. Process for discovering a compound which binds to receptors according to 
Claim 9, which encompasses the following steps: 

(a) bringing a host cell according to one of Claims 4 to 7, a polypeptide 
according to Claim 8 or a receptor according to Claim 9 into contact 
with a compound or a mixture of compounds under conditions which 
permit interaction of the compound(s) with the host cell, the polypeptide 
or the receptor, and 

(b) determining the compound(s) which bind(s) specifically to the receptors. 

20. Process for discovering compounds which alter the expression of receptors 
according to Claim 9, which comprises the following steps: 

(a) bringing a host cell according to one of Claims 4 to 7 or a transgenic 
invertebrate according to Claim 11 or 12 into contact with a compound 
or a mixture of compounds, 

(b) determining the receptor concentration, and 

(c) determining the compound(s) which specifically influence(s) the 
expression of the receptor. 

21. Use of at least one nucleic acid according to Claim 1, one vector according to 
Claim 2 or 3, one regulatory region according to Claim 16 or one antibody 
according to Claim 11 for discovering novel active compounds for plant 
protection or for discovering genes which encode polypeptides which are 
involved in synthesizing functionally similar acetylcholine receptors in insects. 

Nucleic acids which encode insect acetylcholine receptor subunits 

Abstract 

The invention relates to nucleic acids which encode insect acetylcholine receptor 
subunits, to the corresponding polypeptides, and to processes for discovering novel 
active compounds for plant protection. 

Fig. 1 (axis label: Measuring Points) 



SEQUENCE LISTING 

<110> Bayer Aktiengesellschaft 

<120> Nucleic acids which encode insect acetylcholine receptor subunits 

<130> Le A 33 020-Foreign Countries 

<140> 
<141> 

<150> DE-198 19 829.9 
<151> 1998-05-04 

<160> 6 

<170> PatentIn Ver. 2.1 

<210> 1 
<211> 2886 
<212> DNA 
<213> Drosophila melanogaster 

<220> 
<221> CDS 
<222> (372)..(2681) 

<400> 1 
ggcacgagaa aaagttgtgg tataaacttt tattgtagga aaacgcataa aaataataga 60 
aaaacgctct tcgggttgta aagaaaataa gaagacaaaa gaaagacatg aaaacgttgc 120 
aaacaataaa gcatatactt gccatattga tataaaggga aatcgtgaaa aggcggtgaa 180 
aatttcgtaa gattagttgg tattaagggc agcccatgca cacagctaaa aagggaacta 240 
aaaaaacccc gcacagaaca atgaaagctg cagcagctgg ataaggccga caaaaccgaa 300 
aattatatta ttgtaatcta gtagagagca gacaacatat ccgctggcaa caaccaacac 360 

cgaaagagac t atg aaa aat gca caa ctg aaa ctg act gaa gtt gac gat 410 
Met Lys Asn Ala Gln Leu Lys Leu Thr Glu Val Asp Asp 
1 5 10 

gat gag ctg tgg ctg gca gta aga tta gcg cac tgc agc agc aac ttt 458 
Asp Glu Leu Trp Leu Ala Val Arg Leu Ala His Cys Ser Ser Asn Phe 
15 20 25 

agc agc agt agc agc aca aga acc acc agc agc aac cag agg cac aac 506 
Ser Ser Ser Ser Ser Thr Arg Thr Thr Ser Ser Asn Gln Arg His Asn 
30 35 40 45 

cag caa ctc aca aca ctg caa cca agg agc tta agt aca aaa cac cac 554 
Gln Gln Leu Thr Thr Leu Gln Pro Arg Ser Leu Ser Thr Lys His His 
50 55 60 

agc aac att gca agc gag cag cac aat agc cag caa cag gag cca gca 602 
Ser Asn Ile Ala Ser Glu Gln His Asn Ser Gln Gln Gln Glu Pro Ala 
65 70 75 

tcg aag gac gag gat gta gcc aac cac ggt aga agc aat gac cag cag 650 
Ser Lys Asp Glu Asp Val Ala Asn His Gly Arg Ser Asn Asp Gln Gln 
80 85 90 

acg cat ctg caa cag cta gac agc agc aac atg ttg tcg cca aag aca 698 
Thr His Leu Gln Gln Leu Asp Ser Ser Asn Met Leu Ser Pro Lys Thr 
95 100 105 

gcc gca gca gca act gct gcc ggc gat gaa gca aca acc caa caa cca 746 
Ala Ala Ala Ala Thr Ala Ala Gly Asp Glu Ala Thr Thr Gln Gln Pro 
110 115 120 125 

aca aac ata aga ctg tgt gca cgc aag cga caa cga ttg cgt cgc cga 794 
Thr Asn Ile Arg Leu Cys Ala Arg Lys Arg Gln Arg Leu Arg Arg Arg 
130 135 140 

cga aaa aga aaa cca gca acc cca aac gaa aca gat atc aag aaa caa 842 
Arg Lys Arg Lys Pro Ala Thr Pro Asn Glu Thr Asp Ile Lys Lys Gln 
145 150 155 

cag caa ctt agc atg cct ccc ttc aaa acg cgc aaa tcc acg gac acc 890 
Gln Gln Leu Ser Met Pro Pro Phe Lys Thr Arg Lys Ser Thr Asp Thr 
160 165 170 

tac agc aca cca gca gca aca acc agc tgt ccg aca gcc acc tac atg 938 
Tyr Ser Thr Pro Ala Ala Thr Thr Ser Cys Pro Thr Ala Thr Tyr Met 
175 180 185 

caa tgt cga gcc agc gac aat gag ttc agt att ccg ata tcg aga cat 986 
Gln Cys Arg Ala Ser Asp Asn Glu Phe Ser Ile Pro Ile Ser Arg His 
190 195 200 205 

gat aga gta tcc acg gcc aca ttc gcc tgg gtg ttg cat gtg ctg cag 1034 
Asp Arg Val Ser Thr Ala Thr Phe Ala Trp Val Leu His Val Leu Gln 
210 215 220 



gtg ctg ctc gtg tcg ctg caa cag tgg caa ctt cac gtg caa cag cga 1082 
Val Leu Leu Val Ser Leu Gln Gln Trp Gln Leu His Val Gln Gln Arg 
225 230 235 

tcg gtg cta ctg ttc aga agg atc gca gcg agc acc atc gcc ttc att 1130 
Ser Val Leu Leu Phe Arg Arg Ile Ala Ala Ser Thr Ile Ala Phe Ile 
240 245 250 

tcc tat tta ggc agc ttt gca gcg caa ctg aaa aat agc agc agc agc 1178 
Ser Tyr Leu Gly Ser Phe Ala Ala Gln Leu Lys Asn Ser Ser Ser Ser 
255 260 265 

agt agc agc agc aac agc agc aac aac agc agc acg caa ata tta aac 1226 
Ser Ser Ser Ser Asn Ser Ser Asn Asn Ser Ser Thr Gln Ile Leu Asn 
270 275 280 285 

gga ctt aat aaa cac tca tgg ata ttt tta ttg ata tat ttg aat tta 1274 
Gly Leu Asn Lys His Ser Trp Ile Phe Leu Leu Ile Tyr Leu Asn Leu 
290 295 300 

tct gct aaa gtt tgc cta gca gga tat cat gaa aag aga ctg tta cac 1322 
Ser Ala Lys Val Cys Leu Ala Gly Tyr His Glu Lys Arg Leu Leu His 
305 310 315 

gat ctt ttg gat cct tat aat aca cta gaa cgt ccc gtt ctc aat gaa 1370 
Asp Leu Leu Asp Pro Tyr Asn Thr Leu Glu Arg Pro Val Leu Asn Glu 
320 325 330 

tcg gac ccg tta caa tta agc ttt ggt tta act tta atg caa att atc 1418 
Ser Asp Pro Leu Gln Leu Ser Phe Gly Leu Thr Leu Met Gln Ile Ile 
335 340 345 

gat gtg gac gag aaa aat caa ttg cta gtc act aat gtg tgg tta aaa 1466 
Asp Val Asp Glu Lys Asn Gln Leu Leu Val Thr Asn Val Trp Leu Lys 
350 355 360 365 

ctg gag tgg aac gac atg aat ctc cgc tgg aac acc tcc gac tat ggc 1514 
Leu Glu Trp Asn Asp Met Asn Leu Arg Trp Asn Thr Ser Asp Tyr Gly 
370 375 380 

gga gtt aag gat ctg cga ata ccg ccg cat cgc atc tgg aag ccg gac 1562 
Gly Val Lys Asp Leu Arg Ile Pro Pro His Arg Ile Trp Lys Pro Asp 
385 390 395 

gtg ctg atg tac aac agt gcg gat gag gga ttt gac ggc acc tac cag 1610 
Val Leu Met Tyr Asn Ser Ala Asp Glu Gly Phe Asp Gly Thr Tyr Gln 
400 405 410 

acg aac gtg gtg gtg cgg aac aac ggc tcg tgt cta tac gtt ccg ccg 1658 
Thr Asn Val Val Val Arg Asn Asn Gly Ser Cys Leu Tyr Val Pro Pro 
415 420 425 

ggg atc ttc aag tcg acg tgc aag atc gac atc acg tgg ttc ccc ttc 1706 
Gly Ile Phe Lys Ser Thr Cys Lys Ile Asp Ile Thr Trp Phe Pro Phe 
430 435 440 445 

gat gac cag cgg tgc gag atg aag ttc ggc agt tgg acc tac gac gga 1754 
Asp Asp Gln Arg Cys Glu Met Lys Phe Gly Ser Trp Thr Tyr Asp Gly 
450 455 460 

ttc cag ctg gat tta caa tta caa gat gaa act ggc ggt gat atc agc 1802 
Phe Gln Leu Asp Leu Gln Leu Gln Asp Glu Thr Gly Gly Asp Ile Ser 
465 470 475 

agt tac gtg ctc aac ggc gag tgg gaa cta ctg ggt gtg ccc ggc aaa 1850 
Ser Tyr Val Leu Asn Gly Glu Trp Glu Leu Leu Gly Val Pro Gly Lys 
480 485 490 

cgt aac gag atc tat tac aac tgc tgc ccg gaa ccc tat ata gac atc 1898 
Arg Asn Glu Ile Tyr Tyr Asn Cys Cys Pro Glu Pro Tyr Ile Asp Ile 
495 500 505 

acc ttc gcc atc atc atc cgc cga cga aca ctg tac tat ttc ttc aac 1946 
Thr Phe Ala Ile Ile Ile Arg Arg Arg Thr Leu Tyr Tyr Phe Phe Asn 
510 515 520 525 

ctg atc ata cct tgt gta ctg att gcc tcc atg gcc ttg ctc gga ttc 1994 
Leu Ile Ile Pro Cys Val Leu Ile Ala Ser Met Ala Leu Leu Gly Phe 
530 535 540 

acc ctg ccg cca gat tcg ggt gaa aaa tta tcg ctg ggt gtt acc atc 2042 
Thr Leu Pro Pro Asp Ser Gly Glu Lys Leu Ser Leu Gly Val Thr Ile 
545 550 555 

ttg ctc tcg ctg acc gtg ttt ctg aat atg gtt gcc gag aca atg ccg 2090 
Leu Leu Ser Leu Thr Val Phe Leu Asn Met Val Ala Glu Thr Met Pro 
560 565 570 

gct act tcc gat gcg gtg cca ttg tgg ata cgc atc gtg ttt ttg tgc 2138 
Ala Thr Ser Asp Ala Val Pro Leu Trp Ile Arg Ile Val Phe Leu Cys 
575 580 585 

tgg ctg cca tgg ata ttg cga atg agt cgc cca gga cga ccg ctg atc 2186 
Trp Leu Pro Trp Ile Leu Arg Met Ser Arg Pro Gly Arg Pro Leu Ile 
590 595 600 605 

cta gag ttc ccg acc acg ccc tgt tcg gac aca tcc tcc gag cgg aag 2234 
Leu Glu Phe Pro Thr Thr Pro Cys Ser Asp Thr Ser Ser Glu Arg Lys 
610 615 620 

cac cag ata ctc tcc gac gtt gag ctg aaa gag cgc tcg tcg aaa tcg 2282 
His Gln Ile Leu Ser Asp Val Glu Leu Lys Glu Arg Ser Ser Lys Ser 
625 630 635 

ctg ctg gcc aac gta cta gac atc gat gat gac ttc cgg cac aat tgt 2330 
Leu Leu Ala Asn Val Leu Asp Ile Asp Asp Asp Phe Arg His Asn Cys 
640 645 650 

cgc ccc atg acg ccc ggc gga aca ctg cca cac aac ccg gct ttc tat 2378 
Arg Pro Met Thr Pro Gly Gly Thr Leu Pro His Asn Pro Ala Phe Tyr 
655 660 665 

cgc acg gtt tat gga caa ggc gac gat ggc agc att ggg cca att ggc 2426 
Arg Thr Val Tyr Gly Gln Gly Asp Asp Gly Ser Ile Gly Pro Ile Gly 
670 675 680 685 

agc acc cga atg ccg gat gcg gtc acc cat cat acg tgc atc aaa tca 2474 
Ser Thr Arg Met Pro Asp Ala Val Thr His His Thr Cys Ile Lys Ser 
690 695 700 

tca act gaa tat gaa tta ggt tta atc tta aag gaa att cgc ttt ata 2522 
Ser Thr Glu Tyr Glu Leu Gly Leu Ile Leu Lys Glu Ile Arg Phe Ile 
705 710 715 



act gat cag cta cgt aaa gat gac gag tgc aat gac att gcc aat gat 2570 
Thr Asp Gln Leu Arg Lys Asp Asp Glu Cys Asn Asp Ile Ala Asn Asp 
720 725 730 

tgg aaa ttt gca gct atg gtc gtt gac aga ctg tgc ctt atc ata ttc 2618 
Trp Lys Phe Ala Ala Met Val Val Asp Arg Leu Cys Leu Ile Ile Phe 
735 740 745 

aca atg ttc gca ata tta gcc aca ata gct gta cta cta tcg gca cca 2666 
Thr Met Phe Ala Ile Leu Ala Thr Ile Ala Val Leu Leu Ser Ala Pro 
750 755 760 765 

cat att att gtc tcg tagccatatg ggcgaggtgg ttattgttat tggttttatt 2721 
His Ile Ile Val Ser 
770 

ataaaatcaa tttgttaatt attaaattaa taacgaaact ctttaagtaa attaaaacta 2781 
aaaagacact aaaaaagcac aaaaaaatag gaaaatacat gataaaaccc atgaactaaa 2841 
taatacatcc aagaaaaacc aaaacaaaaa aaaaaaaaaa aaaaa 2886 

<210> 3 
<211> 3701 
<212> DNA 
<213> Heliothis virescens 

<220> 
<221> CDS 
<222> (335)..(1822) 

<400> 3 

ggcacgagcc gctgccccac ggtcggccgc actccgctga acaacaatgc tcaaaaacac 60 
gccgtgactc cacacacatc ccctcggcgc agtaggcgat gtttgaggat cggacggcac 120 
gcgtggccgt cggcgagcgg tcgtgaacaa gttgcataca tatgaaaacc gtaaaaagat 180 
tgaattttaa gccgatcgtg ttcgatagat cctaatagag aagcgggagt gcggcgtttg 240 
gtaggcgggg gtcgagtcgc gcggtcgggg gaaatggcgc ggcgcggggc ggcggcggcg 300 

gcggcgcgcg gcgcggcggc gtcgcggcgc tgac atg ggc ggg cgg gcg cgc cgc 355 
Met Gly Gly Arg Ala Arg Arg 
1 5 

tcg cac ttg gcg gcg ccc gcg ggc ctg ctg ctg ctg ctg tgc ctg ctc 403 
Ser His Leu Ala Ala Pro Ala Gly Leu Leu Leu Leu Leu Cys Leu Leu 
10 15 20 

tgg ccg agg ggg gca cgc tgc ggg tac cac gag aag cgg cta ctg cac 451 
Trp Pro Arg Gly Ala Arg Cys Gly Tyr His Glu Lys Arg Leu Leu His 
25 30 35 

cac cta ttg gac cac tac aac gta ctg gag agg ccc gtc gtc aac gag 499 
His Leu Leu Asp His Tyr Asn Val Leu Glu Arg Pro Val Val Asn Glu 
40 45 50 55 

agc gac ccg ctg cag ctc tcc ttc ggc ctc acg ctc atg cag atc atc 547 
Ser Asp Pro Leu Gln Leu Ser Phe Gly Leu Thr Leu Met Gln Ile Ile 
60 65 70 

gac gtg gac gag aag aac cag ctt tta ata aca aac atc tgg cta aaa 595 
Asp Val Asp Glu Lys Asn Gln Leu Leu Ile Thr Asn Ile Trp Leu Lys 
75 80 85 

cta gag tgg aat gat atg aac ttg agg tgg aac act tca gat ttc ggc 643 
Leu Glu Trp Asn Asp Met Asn Leu Arg Trp Asn Thr Ser Asp Phe Gly 
90 95 100 

ggg gtc aaa gat tta aga gtg cca ccc cac aga cta tgg aaa cca gac 691 
Gly Val Lys Asp Leu Arg Val Pro Pro His Arg Leu Trp Lys Pro Asp 
105 110 115 

gtc ctt atg tac aac agc gcg gac gaa ggg ttc gac agc acg tat cca 739 
Val Leu Met Tyr Asn Ser Ala Asp Glu Gly Phe Asp Ser Thr Tyr Pro 
120 125 130 135 

acg aac gtg gtg gtg cgg aac aac ggc tcg tgt ctg tac gtg ccg ccc 787 
Thr Asn Val Val Val Arg Asn Asn Gly Ser Cys Leu Tyr Val Pro Pro 
140 145 150 

ggc atc ttc aag agc acc tgc aag atc gac atc acc tgg ttc ccc ttc 835 
Gly Ile Phe Lys Ser Thr Cys Lys Ile Asp Ile Thr Trp Phe Pro Phe 
155 160 165 

gac gac caa cga tgc gag atg aag ttt ggc agc tgg act tat gat ggt 883 
Asp Asp Gln Arg Cys Glu Met Lys Phe Gly Ser Trp Thr Tyr Asp Gly 
170 175 180 

tat cag ttg gat cta caa cta cag gat gaa ggg ggc gga gat ata agc 931 
Tyr Gln Leu Asp Leu Gln Leu Gln Asp Glu Gly Gly Gly Asp Ile Ser 
185 190 195 

agt ttt gtc acg aat ggc gaa tgg gag tta ata gga gtc ccc ggc aag 979 
Ser Phe Val Thr Asn Gly Glu Trp Glu Leu Ile Gly Val Pro Gly Lys 
200 205 210 215 

cgc aac gag atc tac tac aac tgt tgt ccg gag cca tac atc gac atc 1027 
Arg Asn Glu Ile Tyr Tyr Asn Cys Cys Pro Glu Pro Tyr Ile Asp Ile 
220 225 230 

acg ttt gcg gtg gtg atc cgg agg aaa acg ctc tac tac ttc ttc aat 1075 
Thr Phe Ala Val Val Ile Arg Arg Lys Thr Leu Tyr Tyr Phe Phe Asn 
235 240 245 

ctg atc gtg ccc tgc gtg ctc atc gcc tcc atg gct cta ttg ggg ttc 1123 
Leu Ile Val Pro Cys Val Leu Ile Ala Ser Met Ala Leu Leu Gly Phe 
250 255 260 

acc ttg cct cca gac tcc gga gaa aag ttg tct tta ggt gtg acg ata 1171 
Thr Leu Pro Pro Asp Ser Gly Glu Lys Leu Ser Leu Gly Val Thr Ile 
265 270 275 

tta ctg tcg ttg acg gtg ttc ctc aac atg gtg gcg gag acg atg cca 1219 
Leu Leu Ser Leu Thr Val Phe Leu Asn Met Val Ala Glu Thr Met Pro 
280 285 290 295 

gcg acg tcg gac gcc gtg ccc ttg ctc ggc acc tac ttc aac tgc atc 1267 
Ala Thr Ser Asp Ala Val Pro Leu Leu Gly Thr Tyr Phe Asn Cys Ile 
300 305 310 

atg ttc atg gtg gct tcc tcc gtc gtc tcc acc ata ctg atc ctc aac 1315 
Met Phe Met Val Ala Ser Ser Val Val Ser Thr Ile Leu Ile Leu Asn 
315 320 325 

tac cac cac cgg cac gca gac act cac gaa atg agt gat tgg att cgt 1363 
Tyr His His Arg His Ala Asp Thr His Glu Met Ser Asp Trp Ile Arg 
330 335 340 

tgc gtg ttc ctt tat tgg ctg ccg tgg gtg ctg cgc atg tca cgg ccc 1411 
Cys Val Phe Leu Tyr Trp Leu Pro Trp Val Leu Arg Met Ser Arg Pro 
345 350 355 

ggc tcg gcg acg acg ccg ccg ccg gcg cgc gta cct ccg ccg ccg gac 1459 
Gly Ser Ala Thr Thr Pro Pro Pro Ala Arg Val Pro Pro Pro Pro Asp 
360 365 370 375 

ctg gag ctg cgc gag cgc tcc tcc aag tcg ctc cta gcg aac gtg ctc 1507 
Leu Glu Leu Arg Glu Arg Ser Ser Lys Ser Leu Leu Ala Asn Val Leu 
380 385 390 

gac atc gat gac gac ttc cgc cac ccg caa gcg cag cag ccg caa tgc 1555 
Asp Ile Asp Asp Asp Phe Arg His Pro Gln Ala Gln Gln Pro Gln Cys 
395 400 405 

tgc cga tac tac agg ggg ggt gag gag aat ggc gcg ggg ttg gcg gcg 1603 
Cys Arg Tyr Tyr Arg Gly Gly Glu Glu Asn Gly Ala Gly Leu Ala Ala 
410 415 420 

cac agt tgc ttc ggt gtc gac tac gag ctc tcc ctc att ctg aag gag 1651 
His Ser Cys Phe Gly Val Asp Tyr Glu Leu Ser Leu Ile Leu Lys Glu 
425 430 435 

att aga gtc atc aca gat cag atg cgc aag gac gac gaa gat gcg gac 1699 
Ile Arg Val Ile Thr Asp Gln Met Arg Lys Asp Asp Glu Asp Ala Asp 
440 445 450 455 

att tcg cgc gac tgg aag ttc gcc gcc atg gtc gtg gac aga ctg tgc 1747 
Ile Ser Arg Asp Trp Lys Phe Ala Ala Met Val Val Asp Arg Leu Cys 
460 465 470 

ctt att atc ttt acc ctg ttc aca atc atc gcc acg cta gcc gtg ctg 1795 
Leu Ile Ile Phe Thr Leu Phe Thr Ile Ile Ala Thr Leu Ala Val Leu 
475 480 485 

ctg tcc gcg cca cac atc atg gtg tcg tagcgacccg cccgcttgcg 1842 
Leu Ser Ala Pro His Ile Met Val Ser 
490 495 

gatacgcatg cgaaaagttc tgtgataccg cgaatatttg ttaagttgtg atgagcgaag 1902 
tggcgcggac ggtgacgccg cggcgtcgga gttgccgccg cctgcctcgc cgcccgcgcc 1962 
cccctgtaga cataagttac cgctgactgc caaccctgta cgttcaacaa ataactgccc 2022 
atccgactaa cgtcttttat ccccttgaaa aattcagcga ttgtgtaccc ctttcttcca 2082 
agaatacaat gacaaatggt cgtcacgctc agtggaatca atcccgtact cttcgcccga 2142 
tatttccctt agggtatgtc acgagtttga atgagcggtt ccgtatcaga cgttccgtcc 2202 
ccggaacggt cgtcccctgc gataaagtgg cagtacgtgc tatacaggca cttaaggccg 2262 
ccacgccacg gcgccgcggt gcgctcgggc cgcgaacccg cgaccctcac cgctgcaagt 2322 
ggccacccac tagacaagac tgcggcagaa aatatttgca caaaaacgtc ttccttctta 2382 
ccgatgaacg acctgattcg catttaaaat taaactttgt tagaacttct tcgattcttg 2442 
aaatctattg tacagtttag agtttgggcg gtgaaacaat ggccctttgt ttccttcttg 2502 
ttcgattcca tgaatcgtgg ttataatccc tagttttatt ttcggatata tttgtgtcag 2562 
tagctagtat agaactttac aaacaatgtt gattcaattg gtacaggttg tgatatgcct 2622 
cgttgtgaac gggtccgata ttgttataaa tggtaaaata cccatggcta tagcttaata 2682 
aatcgttcgt taaaagttgt agttaaacaa atattatttt aataaagtca tatctgggtc 2742 
ttccggaacg acttttacaa ataattaaat tacatattaa tatcacgttt gtacttcttt 2802 
ccatacagtt acagtaattc gtatgctgaa aataatatta gcttgtaaaa ttttcttctt 2862 
cgaaaattta ttcaaacaga tgcgaccatc gtttcaaaca tttacatgta atatagaact 2922 
cattttataa gatatacaac attttataag tacaagaagt tgtaacatga accggttttt 2982 
cgttacatag agggtataac acaaaggtgc ctacatattg acagatgcga agcacgatca 3042 
gttgataagc acaggtacac tatatcctga catccgacag tcctgccgct cgtctgccac 3102 
actcggaaac attcgacagt tcagtttact gctccgccat catcgattgt taagtttgtt 3162 
gttctaactc atcgcattca tttcattcaa aaacattgta aacctctcaa ggggaaaacg 3222 
tgttgtaaac agtgagagtg cgcgggtaca accgacacgc gaatgtaccc tcgcaaggct 3282 
cctgtaatgt tttcctcttc cgaggtgttg ctgagagtaa tcttagacgg tccgatggaa 3342 
gttgcggacc ggatatgatt acaagtcaat gtttttaagt catccgttta tttattgtta 3402 
tatcttctta ccattcgcta gaggttgtgt gacgacccgg acggtgggcg ccgcaacccg 3462 
cacacgcggg gttccatctt tgtattagat ggaagttgtg cggcatctct ccgtcggcaa 3522 
tgggacaacc cgttgtcccc aacatttgtt caattgttag ggttaactct gaattgcact 3582 
ttgtttatta aatataaacg aatgaaacaa aaaaaaaaaa aaaaaactcg agagtacttc 3642 
tagagcggcc gcgggcccat cgattttcca cccgggtggg gtaccargta agtgtaccc 3701 

<210> 5 
<211> 3109 
<212> DNA 
<213> Heliothis virescens 

<220> 
<221> CDS 
<222> (95)..(1597) 

<400> 5 

ggcacgagcc ggccgcacgt tgtcccaggc cgcatgagcg cgccggcgtg ctagcgcagc 60 

gtgcgcgggt gtggtatgcc cgcgcgtcgc cgct atg gcc cct atg ttg gcg gcc 115 
Met Ala Pro Met Leu Ala Ala 
1 5 

ttg gcg ctg ctg gct ttg ctg ccc gta tcg gag caa ggt cct cac gag 163 
Leu Ala Leu Leu Ala Leu Leu Pro Val Ser Glu Gln Gly Pro His Glu 
10 15 20 

aag aga ctc ctg aac gcg ttg ctg gcg aac tac aac acc ctg gag cga 211 
Lys Arg Leu Leu Asn Ala Leu Leu Ala Asn Tyr Asn Thr Leu Glu Arg 
25 30 35 

ccg gtg gcc aac gag agc gaa ccg cta gag gtc agg ttc ggc ttg acc 259 
Pro Val Ala Asn Glu Ser Glu Pro Leu Glu Val Arg Phe Gly Leu Thr 
40 45 50 55 

ttg cag caa atc att gac gtg gac gag aag aat caa cta ctt ata acc 307 
Leu Gln Gln Ile Ile Asp Val Asp Glu Lys Asn Gln Leu Leu Ile Thr 
60 65 70 

aat ata tgg ctg tcg ttg gag tgg aat gac tac aac ctg agg tgg aac 355 
Asn Ile Trp Leu Ser Leu Glu Trp Asn Asp Tyr Asn Leu Arg Trp Asn 
75 80 85 

gac agc gag tat ggc ggg gtc aag gac ctc agg atc acg ccc aac aag 403 
Asp Ser Glu Tyr Gly Gly Val Lys Asp Leu Arg Ile Thr Pro Asn Lys 
90 95 100 

ttg tgg aag ccg gac gtc ctt atg tat aat agt gct gac gag ggt ttt 451 
Leu Trp Lys Pro Asp Val Leu Met Tyr Asn Ser Ala Asp Glu Gly Phe 
105 110 115 

gac ggg acc tac cag acc aac gtg gtg gtc aga agc ggc ggc agt tgc 499 
Asp Gly Thr Tyr Gln Thr Asn Val Val Val Arg Ser Gly Gly Ser Cys 
120 125 130 135 

ctg tac gtg cca cct ggc ata ttc aag agc aca tgc aag atg gac atc 547 
Leu Tyr Val Pro Pro Gly Ile Phe Lys Ser Thr Cys Lys Met Asp Ile 
140 145 150 

gcg tgg ttt ccc ttc gac gac caa cac tgt gat atg aag ttc ggt agc 595 
Ala Trp Phe Pro Phe Asp Asp Gln His Cys Asp Met Lys Phe Gly Ser 
155 160 165 

tgg aca tat gac ggc aat cag ttg gat ctg gtg cta aaa gat gag gca 643 
Trp Thr Tyr Asp Gly Asn Gln Leu Asp Leu Val Leu Lys Asp Glu Ala 
170 175 180 

ggc gat cta tcg gac ttc ata aca aat ggg gag tgg tat cta ata 691 
Gly Gly Asp Leu Ser Asp Phe Ile Thr Asn Gly Glu Trp Tyr Leu Ile 
185 190 195 

gga atg cca ggc aaa aag aac aca ata aca tac gcg tgc tgc ccc gag 739 
Gly Met Pro Gly Lys Lys Asn Thr Ile Thr Tyr Ala Cys Cys Pro Glu 
200 205 210 215 

ccc tac gtg gac gtc acc ttc acc atc atg ata aga aga cga acc ttg 787 
Pro Tyr Val Asp Val Thr Phe Thr Ile Met Ile Arg Arg Arg Thr Leu 
220 225 230 

tac tac ttc ttc aac ctg atc gtc ccg tgc gtg ctg atc tca tcg atg 835 
Tyr Tyr Phe Phe Asn Leu Ile Val Pro Cys Val Leu Ile Ser Ser Met 
235 240 245 

gca ctc ctc ggc ttc aca ctg cca cca gac tcc gga gag aaa ctc aca 883 
Ala Leu Leu Gly Phe Thr Leu Pro Pro Asp Ser Gly Glu Lys Leu Thr 
250 255 260 

ctt gga gtc act att ctt cta tcg ctg acg gtg ttc ctc aac ctg gta 931 
Leu Gly Val Thr Ile Leu Leu Ser Leu Thr Val Phe Leu Asn Leu Val 
265 270 275 

gcc gag acc ctg cca cag gtc tcc gac gct atc ccc ctg tta ggg acg 979 
Ala Glu Thr Leu Pro Gln Val Ser Asp Ala Ile Pro Leu Leu Gly Thr 
280 285 290 295 

tac ttc aat tgc atc atg ttc atg gta gcg tcg tct gtg gta ctg act 1027 
Tyr Phe Asn Cys Ile Met Phe Met Val Ala Ser Ser Val Val Leu Thr 
300 305 310 
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gtg gtg gta etc aat tac cac cat cga aca get gat ata cat gaa atg 1075 

Val Val Val Leu Asn Tyr His His Arg Thr Ala Asp lie His Glu Met 
315 320 325 

cca cag tgg ata aaa tca gta ttc cta caa tgg ttg cca tgg ata ctg 1123 
Pro Gln Trp Ile Lys Ser Val Phe Leu Gln Trp Leu Pro Trp Ile Leu 
330 335 340 

cga atg tcg agg cca ggg aag aag atc acc agg aag act ata atg atg 1171 
Arg Met Ser Arg Pro Gly Lys Lys Ile Thr Arg Lys Thr Ile Met Met 
345 350 355 

aac acg agg atg agg gag ctg gaa ctg aag gag agg tcg tcg aag tcc 1219 
Asn Thr Arg Met Arg Glu Leu Glu Leu Lys Glu Arg Ser Ser Lys Ser 
360 365 370 375 

ttg ctg gcg aat gtt cta gat att gat gat gac ttc aga cac ggc cct 1267 
Leu Leu Ala Asn Val Leu Asp Ile Asp Asp Asp Phe Arg His Gly Pro 
380 385 390 

ccg cct cct aac agt act gcc tcg acc ggg aat ttg gga cct ggg tgc 1315 
Pro Pro Pro Asn Ser Thr Ala Ser Thr Gly Asn Leu Gly Pro Gly Cys 
395 400 405 

tca ata ttc cgc acg gat ttc cgt cgg tcg ttc gtc cgt ccg tcc acg 1363 
Ser Ile Phe Arg Thr Asp Phe Arg Arg Ser Phe Val Arg Pro Ser Thr 
410 415 420 



atg gaa gac gtg ggc ggc ggg ctg ggt agc cac cat cgc gag ctg cac 1411 
Met Glu Asp Val Gly Gly Gly Leu Gly Ser His His Arg Glu Leu His 
425 430 435 

ctc ata ctg aga gag ctg cag ttc atc acg gcc agg atg aag aag gct 1459 
Leu Ile Leu Arg Glu Leu Gln Phe Ile Thr Ala Arg Met Lys Lys Ala 
440 445 450 455 



gat gag gaa gcc gag ctg atc agc gac tgg aag ttt gct gcg atg gtt 1507 
Asp Glu Glu Ala Glu Leu Ile Ser Asp Trp Lys Phe Ala Ala Met Val 
460 465 470 



gtt gat agg ttt tgc ctg ttc gtg ttc aca ctt ttc aca atc atc gcg 1555 
Val Asp Arg Phe Cys Leu Phe Val Phe Thr Leu Phe Thr Ile Ile Ala 
475 480 485 

aca gta gct gtc ctg tta tcg gca ccg cat atc atc gtg caa 1597 
Thr Val Ala Val Leu Leu Ser Ala Pro His Ile Ile Val Gln 
490 495 500 



tgaaccaacc actgagccgg caactccggc gcatgaatga gagaaataat tattagatcg 1657 



ccgatttgta attataattg ataatgtaat taaattaaat acgtggttga aacgcacacg 1717 



tctccataac aaagtcttaa gacattaaat tatgataaat ttacatattg tagttaagtc 1777 
gagtgttgat ggaaatttta gccggcgcaa ggagtttcgt gaaggtctgt atatattttt 1837 



tcttattgtt gtatattgta tcgttgttca tgttttcttt caggaagtga gctttgtact 1897 



gtttgtttct tcgatggcag gtgcacttca gttcaggctg aaatttccat taacatttat 1957 



ttaaacaaat gtgatgttga ctaggatgtt atacagataa atgttgacgt gtataatttg 2017 

ttaaaataaa caatattaat tactattact aaacgatatt ataaacgaag tactaacgag 2077 



ggttacttta atgggaagaa cgctaagctg gcacagagtt gcattaattt gaaaaaagaa 2137 



attacggaaa aaagtttatt gaaaattgaa ctttttggaa ggaaagtaac gtttgatcaa 2197 



aaaagtttgt aaaacgaaag ttcggttctg cgccaatact ggaattaaaa ttctcgtaaa 2257 



tattagggaa aagaaggtcc tttaaaacaa aagatttgaa ccggcatcct ttttacaagt 2317 



aatgagggat cacagatgat gacaaaaaac cttagggtat ataagtaatg tacataatgg 2377 

atcaaatatc ggtagagtca agaatagtta acgatttaag attattccat tcgatattaa 2437 



aattcgatta gcgattgtcg ctgcgtctac tttgatacat atcgatttga atcgatattg 2497 



tataaattta gatagatcgg acattagtaa tgagtatgga cgttttaatt tttaaaaaag 2557 



aatgtactac gaagattaaa tccaggaatt gttaaacagt tatggaattg ataagaaatc 2617 



caattaat acggaaccaa aggtagacta ggtgtagcat caggagattg aattaaaaca 2677 



taaattagga ccgacttaaa tggaacttgc gagtgtattg ataacttttt aatttaaaaa 2737 



ctcattgtcg attaaatgga gaataacttt tgatctctcg tatcgataaa tgctcactta 2797 



tatcgata gcgtaatatt ataactgtta gtatatcgat atgggagtaa gtcactagca 2857 

tcagaaatag tcattaatta ggaatcggtt tgtgttaatg ttatgcttag cgaaaatatt 2917 

acaatgctgt tgatatcact aaccatcacg taaccatatt gataaaatgt aaatacagaa 2977 

tattgcggtg tgtatttgta tataaatttt agaaaaaaaa aaaaaaaaaa aactcgagag 3037 

tacttctaga gcggccgcgg gcccatcgat tttccacccg ggtggggtac caggtaagtg 3097 

tacccaattc gc 3109 



